The KOJAK Group Finder: Connecting the Dots via Integrated Knowledge-Based and Statistical Reasoning
Abstract
Link discovery (LD) is a new challenge in data mining whose primary concerns are to identify strong links and discover hidden relationships among entities and organizations based on low-level, incomplete and noisy evidence data. To address this challenge, we are developing a hybrid link discovery system called KOJAK that combines state-of-the-art knowledge representation and reasoning (KR&R) technology with statistical clustering and analysis techniques from the area of data mining. In this paper we report on the architecture and technology of its first fully completed module, the KOJAK Group Finder. The Group Finder is capable of finding hidden groups and group members in large evidence databases. Our group finding approach addresses a variety of important LD challenges, such as exploiting heterogeneous and structurally rich evidence, handling the connectivity curse, noise and corruption, and scaling up to very large, realistic data sets. The first version of the KOJAK Group Finder has been successfully tested and evaluated on a variety of synthetic datasets.
Related resources
INTEGRATING CASE-BASED REASONING, KNOWLEDGE-BASED APPROACH AND TSP ALGORITHM FOR MINIMUM TOUR FINDING
Imagine you have traveled to an unfamiliar city. Before you start your daily tour around the city, you need to know a good route. In Network Theory (NT), this is the traveling salesman problem (TSP). A dynamic programming algorithm is often used for solving this problem. However, when the road network of the city is very complicated and dense, which is usually the case, it will take too long fo...
Using KOJAK Link Discovery Tools to Solve the Cell Phone Calls Mini Challenge
We present a brief summary of the process and tools employed to generate a submission to the VAST-08 Cell Phone Calls Mini Challenge. The primary system used was KOJAK, which is an integrated suite of link discovery tools that support group detection [1], anomaly detection [4], pattern matching and graph simplification. KOJAK generally operates on data represented by semantic graphs where nodes...
A Fully Integrated Range-Finder Based on the Line-Stripe Method
This paper describes an imaging chip for acquiring range information, built in 0.35 μm CMOS technology with a 5 V power supply. The system can extract range information without any mechanical movement, and all the signal processing is done on the chip. All of the image sensors and mixed-signal processors are integrated in the chip. The design range is 1.5 m to 10 m with 18 scales.
The role of divided attention and working memory in mathematical reasoning with the mediation of mathematical knowledge and fluid intelligence in fourth grade elementary students
The purpose of this research is to model the relationship between divided attention and working memory in mathematical reasoning, with the mediation of mathematical knowledge and fluid intelligence, in fourth grade elementary students. The statistical population of the research included all fourth grade male students of primary schools in District 4 of Qom, from which 213 students were randomly selecte...
Maximum Maintainability of Complex Systems via Modulation Based on DSM and Module Layout. Case Study: Laser Range Finder
The present paper aims to investigate the effects of modularity and the layout of subsystems and parts of a complex system on its maintainability. For this purpose, four objective functions have been considered simultaneously: I) maximizing the level of accordance between system design and optimum modularity design, II) maximizing the level of accessibility and the maintenance space required, III...